Abstract. We prove that a random word of length $n$ over a fixed $k$-ary alphabet contains, in expectation,
$\Theta(\sqrt{n})$ distinct palindromic factors. We study this number of factors, $E(n,k)$, in detail, showing
that the limit $\lim_{n\to\infty} E(n,k)/\sqrt{n}$ does not exist for any $k \ge 2$,
$\liminf_{n\to\infty} E(n,k)/\sqrt{n} = \Theta(1)$, and $\limsup_{n\to\infty} E(n,k)/\sqrt{n} = \Theta(\sqrt{k})$.
Such a complicated behaviour stems from the asymmetry between the palindromes of even and odd length.
We show that a similar, but much simpler, result on the expected number of squares in random words holds.
We also provide some experimental data on the number of palindromic factors in random words.


1. Introduction 

Palindromes are among the most important and actively studied repetitions in words. Recall that a word
$w = a_1 \cdots a_n$ is a palindrome if $a_1 \cdots a_n = a_n \cdots a_1$. In particular, all letters are palindromes; the
empty word is also considered a palindrome, but throughout this paper we do not count it. Palindromes
have been objects of intensive study since the 1970s. One direction of this study is formed by different counting
problems; see, for example, [9], where the asymptotic growth of the language of palstars (words that
are concatenations of even-length palindromes) is found. An important group of problems within this 
direction concerns the possible number of distinct palindromic factors, or subpalindromes, in a word. 
We call this number palindromic richness. 

Clearly, for the words containing $k$ different letters the lower bound for their palindromic richness is
$k$. If $k > 2$, then this bound is sharp, since the infinite periodic word $(a_1 \cdots a_k)^\omega$, where $a_1, \ldots, a_k$ are
different letters, has no subpalindromes except letters. For $k = 2$ the lower bound is less straightforward:
the minimum richness of an infinite word is 8 and the minimum richness of an aperiodic infinite word
is 10 [?]. (Moreover, the minimum richness of a finite word of length $\ge 9$ is 8.) On the other hand, the
maximum richness of an $n$-letter word over any alphabet is $n$, as was first observed in [1]. Such "rich"
words are objects of intensive study (see, e.g., [?]). Still, little is known about the number of rich words
of a given length. Currently, the best lower bound on the number of binary rich words is of the form 
where $p(n)$ is a polynomial and $C \approx 37$ [?]. In the same paper, it was conjectured that this number

is upper bounded by while the best proved upper bound is of order Anyway, most of the 

words are not rich, and it is quite interesting to see how the palindromic richness behaves in the generic 
case. We will show, in a straightforward way, that any richness between the two extremums is reachable: 

Proposition 1.1. Any number between 8 and $n$ in the binary case, and between $k$ and $n$ in the $k$-ary case
with $k > 2$, is the palindromic richness of some word of length $n$.

So, the following question is quite natural: 

what is the expected palindromic richness of a random word of length n? 

The following theorem, which is our main result, provides a detailed answer to this question. Note that
the bigger the alphabet, the less probable it is that a random word is a palindrome; so, statements 3
and 4 of this theorem seem rather unexpected.

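Theorem 1.1. Let $k \ge 2$.

(1) The expected palindromic richness $E(n,k)$ of a random $k$-ary word of length $n$ is $\Theta(\sqrt{n})$ as $n \to \infty$
with $k$ fixed.

(2) The ratio $E(n,k)/\sqrt{n}$ has no limit as $n \to \infty$ with $k$ fixed.

(3) The function $C_-(k) = \liminf_{n\to\infty} E(n,k)/\sqrt{n}$ is $\Theta(1)$ as $k \to \infty$.

(4) The function $C(k) = \limsup_{n\to\infty} E(n,k)/\sqrt{n}$ is $\Theta(\sqrt{k})$ as $k \to \infty$.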

We also give more precise theoretical estimates of the quantities $C_-(k)$ and $C(k)$ for some alphabets
and compare them to the results of our experiments. Finally, we show that our technique allows one to
get, in a much easier way, the bound $\Theta(\sqrt{n})$ on the number of squares in a random word.

The text is organized as follows. Section 2 contains notation, definitions, and the proof of
Proposition 1.1. In Sections 3-5 we prove Theorem 1.1. In Sect. 3, we prove the upper bound $O(\sqrt{n})$ and find
the range of lengths containing the main part of all distinct palindromic factors. Then in Sect. 4-5 we
study the probability of getting a palindromic factor of a given length from a prescribed range, using the
results of Guibas and Odlyzko [5, 6] on factor avoidance. The final Sect. 6 is devoted to numerical studies
and to extending our methods to counting the expected number of squares instead of palindromes.


2. Preliminaries 

We study non-empty words over finite alphabets, using the array notation $w = w[1..n]$ when appropriate
and writing $|w|$ for the length of $w$. Any word $w[i..j]$, where $1 \le i \le j \le n$, is a factor of $w$; a factor of
the form $w[1..j]$ (resp., $w[i..n]$) is called a prefix (resp., a suffix) of $w$. A square is any word of the form
$ww$. By $u^\omega$ we denote the right-infinite word obtained by concatenation of an infinite sequence of copies
of the word $u$.

A word $w$ satisfying $w[i] = w[n-i+1]$ for all $i = 1, \ldots, n$ is a palindrome. Palindromic richness of a
word $w$ is the number of distinct palindromes which are factors of $w$.
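
To make the definition concrete, here is a minimal brute-force sketch in Python (the function names are ours and purely illustrative; the quadratic enumeration of factors is only meant for short words).

```python
def is_palindrome(w: str) -> bool:
    """A word is a palindrome if it reads the same in both directions."""
    return w == w[::-1]


def palindromic_richness(w: str) -> int:
    """Number of distinct non-empty palindromic factors of w (brute force)."""
    n = len(w)
    found = set()
    for i in range(n):
        for j in range(i + 1, n + 1):
            factor = w[i:j]
            if is_palindrome(factor):
                found.add(factor)
    return len(found)


if __name__ == "__main__":
    # The prefix u[1..9] of (001101)^omega, used later in the proof of Proposition 1.1.
    print(palindromic_richness("001101001"))
    # A word of maximal richness: every 0^i is a distinct palindrome.
    print(palindromic_richness("00000"))
```

The empty word is not counted, in agreement with the convention of the Introduction.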

By a random k-ary word of length n we mean the random variable equidistributed among all k- 
ary words of length n. The expected palindromic richness E(n, k) of this random word is the main 
characteristic studied in this paper. 

Throughout the paper, the notation $\log$ always stands for the base $k$ logarithm; the natural logarithm
is denoted by $\ln$.

Proof of Proposition 1.1.

Let $k > 2$ and $w = (a_1 \cdots a_k)^\omega$. The word $a_1^{l-k} w[1..n-l+k]$ of length $n$ has exactly $l$ palindromes: all
letters plus the palindromes $a_1^i$ for $i = 2, \ldots, l-k+1$. Since $l$ can be an arbitrary integer between $k$ and
$n$, we are done with this case.

Now consider the binary alphabet $\{0,1\}$. The infinite word $u = (001101)^\omega$ has exactly 8 palindromic
factors: 0, 1, 00, 11, 010, 101, 0110, 1001. All of them appear in $u[1..9]$. Then the word $0^{l-8} u[1..n-l+8]$
of length $n$ has exactly $l$ palindromes for any $l = 8, \ldots, n-1$: those of $u$ plus $0^3, \ldots, 0^{l-6}$. Since
words of length $n$ and richness $n$ exist (for example, $0^n$), we get the desired result.
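
The construction in this proof can be checked mechanically for small parameters. The following Python sketch is illustrative only: the helper names and the choice of letters 'a', 'b', ... are our assumptions, and richness is recomputed by brute force.

```python
from itertools import cycle, islice


def richness(w: str) -> int:
    """Brute-force number of distinct non-empty palindromic factors of w."""
    n = len(w)
    return len({w[i:j] for i in range(n) for j in range(i + 1, n + 1)
                if w[i:j] == w[i:j][::-1]})


def word_with_richness(n: int, l: int, k: int) -> str:
    """Word of length n and richness l, following the proof of Proposition 1.1.

    For k > 2 the alphabet is taken to be the first k lowercase letters
    (an illustrative choice, so k <= 26); for k = 2 it is {0, 1}.
    """
    if k > 2:
        assert k <= l <= n
        letters = [chr(ord("a") + i) for i in range(k)]
        # a_1^(l-k) followed by the prefix of (a_1 ... a_k)^omega of length n-l+k
        return letters[0] * (l - k) + "".join(islice(cycle(letters), n - l + k))
    if l == n:
        return "0" * n
    assert 8 <= l <= n - 1
    # 0^(l-8) followed by the prefix of (001101)^omega of length n-l+8
    return "0" * (l - 8) + "".join(islice(cycle("001101"), n - l + 8))


if __name__ == "__main__":
    n = 12
    print([richness(word_with_richness(n, l, 3)) for l in range(3, n + 1)])
    print([richness(word_with_richness(n, l, 2)) for l in range(8, n + 1)])
```

Each printed list should enumerate the requested richness values in increasing order.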


3. A simple upper bound 

The aim of this section is to prove that $E(n,k) = O(\sqrt{n})$ for any fixed $k$ and to show that most
palindromic factors of a word of length $n$ have length close to $\log n$. The first two lemmas are
straightforward.

Lemma 3.1. The number of distinct $k$-ary palindromes of length $m$ is $\mathrm{Pal}(k,m) = k^{\lceil m/2 \rceil}$.


Proof:

The mentioned quantity is the number of ways to choose the first $\lceil m/2 \rceil$ letters of a word of length $m$.
If this word is a palindrome, the remaining letters are determined uniquely. □

Lemma 3.2. The expected number of palindromic factors* of length $m$ in a $k$-ary word of length $n$ is

$$E(n,k,m) = \frac{n-m+1}{k^{\lfloor m/2 \rfloor}}.$$

Proof:

The probability for a $k$-ary word of length $m$ to be a palindrome is $\mathrm{Pal}(k,m)/k^m = k^{-\lfloor m/2 \rfloor}$ by Lemma 3.1. This
probability obviously coincides with the expected number of palindromic factors of length $m$ at a fixed
position of a word of length $n$. Now the lemma follows by the linearity of expectation, because a word
of length $n$ has $n-m+1$ factors of length $m$. □
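
Both lemmas are easy to confirm by exhaustive enumeration for small parameters. The sketch below is illustrative (function names are ours); it uses exact rational arithmetic so that the comparison with the closed forms is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def pal_count(k: int, m: int) -> int:
    """Pal(k, m) = k^ceil(m/2), as in Lemma 3.1."""
    return k ** ((m + 1) // 2)


def expected_occurrences(n: int, k: int, m: int) -> Fraction:
    """E(n, k, m) = (n - m + 1) / k^floor(m/2), as in Lemma 3.2."""
    return Fraction(n - m + 1, k ** (m // 2))


def brute_force(n: int, k: int, m: int):
    """Enumerate all k-ary words.

    Returns the number of palindromes of length m and the average number of
    (not necessarily distinct) palindromic factors of length m over all words
    of length n.
    """
    palindromes = sum(1 for w in product(range(k), repeat=m) if w == w[::-1])
    occurrences = 0
    for w in product(range(k), repeat=n):
        for i in range(n - m + 1):
            factor = w[i:i + m]
            if factor == factor[::-1]:
                occurrences += 1
    return palindromes, Fraction(occurrences, k ** n)


if __name__ == "__main__":
    for m in range(1, 7):
        print(m, brute_force(6, 2, m), (pal_count(2, m), expected_occurrences(6, 2, m)))
```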


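*Not necessarily distinct!

The following combinatorial lemma is used in the proof of Lemma 3.4.

Lemma 3.3. $\displaystyle \sum_{i=c}^{\infty} \frac{i+1}{k^i} = \frac{(c+1)k - c}{k^{c-1}(k-1)^2}$.

Proof:

The following sequence of transformations holds:

$$\sum_{i=c}^{\infty} (i+1)x^i = \Bigl(\sum_{i=c}^{\infty} x^{i+1}\Bigr)' = \Bigl(\frac{x^{c+1}}{1-x}\Bigr)' = \frac{(c+1)x^c(1-x) + x^{c+1}}{(1-x)^2} = \frac{(c+1)x^c - c\,x^{c+1}}{(1-x)^2} \;\overset{x=1/k}{=}\; \frac{(c+1)k - c}{k^{c-1}(k-1)^2}.$$

□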


In the rest of this section we prove the following upper bound on the expected palindromic richness. 
Some notions and formulas from the proof will be then used throughout the rest of the paper. 

Lemma 3.4. For any fixed $k \ge 2$ one has $E(n,k) \le \sqrt{n}\,(\sqrt{k} + O(1))$.

Proof:

Let $w$ be a word picked uniformly at random from the set of all $k$-ary words of length $n$. It is clear
that the expected number $E(w,m)$ of distinct palindromic factors of length $m$ in $w$ can exceed neither
$\mathrm{Pal}(k,m)$ nor $E(n,k,m)$. So we have the following upper bound:

$$E(n,k) \le \sum_{m=0}^{n} \min\{\mathrm{Pal}(k,m),\, E(n,k,m)\}. \qquad (1)$$

Since the formulas given in Lemmas 3.1 and 3.2 are asymmetric with respect to the parity of $m$, it is
convenient to split the sum in (1) into two sums, corresponding to even and odd values of $m$, respectively,
and compute them separately. So we have

$$\mathrm{Pal}(k,2m) = k^m, \qquad E(n,k,2m) = \frac{n-2m+1}{k^m}, \qquad (2a)$$

$$\mathrm{Pal}(k,2m+1) = k^{m+1}, \qquad E(n,k,2m+1) = \frac{n-2m}{k^m}, \qquad (2b)$$

and then we can write

$$E(n,k) = E_e(n,k) + E_o(n,k) \le \sum_{m=0}^{\lfloor n/2 \rfloor} \min\{\mathrm{Pal}(k,2m),\, E(n,k,2m)\} + \sum_{m=0}^{\lfloor (n-1)/2 \rfloor} \min\{\mathrm{Pal}(k,2m+1),\, E(n,k,2m+1)\}. \qquad (3)$$

The graphs of (2a) and (2b) as functions of $m$ (for $k$ and $n$ fixed) are drawn in Fig. 1.
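
For small parameters the bound (1) can be evaluated exactly and compared with the true expected richness obtained by enumerating all words. The Python sketch below is illustrative (names are ours); the sum starts at $m = 0$ exactly as written in (1), which only makes the bound slightly more generous.

```python
from fractions import Fraction
from itertools import product


def upper_bound(n: int, k: int) -> Fraction:
    """Right-hand side of (1): sum over m of min{Pal(k, m), E(n, k, m)}."""
    total = Fraction(0)
    for m in range(0, n + 1):
        pal = k ** ((m + 1) // 2)                  # Lemma 3.1
        occ = Fraction(n - m + 1, k ** (m // 2))   # Lemma 3.2
        total += min(pal, occ)
    return total


def exact_expected_richness(n: int, k: int) -> Fraction:
    """Exact expected richness E(n, k) by exhaustive enumeration (small n only)."""
    total = 0
    for w in product(range(k), repeat=n):
        pals = set()
        for i in range(n):
            for j in range(i + 1, n + 1):
                factor = w[i:j]
                if factor == factor[::-1]:
                    pals.add(factor)
        total += len(pals)
    return Fraction(total, k ** n)


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, float(exact_expected_richness(n, 2)), float(upper_bound(n, 2)))
```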

So, in each case we have to find the point of intersection of two graphs and then sum up all values 
of Pal to the left of this point and all values of E to the right of this point. We start with even-length 
palindromes. Recall that log denotes the base k logarithm. 
Figure 1. The graphs of Pal and E for even-length (left) and odd-length (right) palindromes.


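The intersection point $p_e = p_e(n,k)$ is given by the equation $k^{2m} = n - 2m + 1$, so $p_e \approx \frac{\log n}{2}$.
Using standard transformations and the Maclaurin series for $\ln(1-x)$, we get a more precise estimate:

$$p_e = \frac{\log(n - 2p_e + 1)}{2} = \frac{\log(n - \log(n - 2p_e + 1) + 1)}{2} = \frac{1}{2}\Bigl(\log n + \log\Bigl(1 - \frac{\log(n - 2p_e + 1) - 1}{n}\Bigr)\Bigr)$$

$$= \frac{\log n}{2} - \frac{\log(n - 2p_e + 1) - 1}{(2\ln k)\cdot n} + O\Bigl(\frac{\log^2 n}{n^2}\Bigr) = \frac{\log n}{2} - \frac{\log n - 1}{(2\ln k)\cdot n} + O\Bigl(\frac{\log^2 n}{n^2}\Bigr). \qquad (4)$$

Replacing geometric sequences by geometric series and applying Lemma 3.3, we obtain

$$E_e(n,k) \le \sum_{m=0}^{\lfloor p_e \rfloor} k^m + \sum_{m=\lfloor p_e \rfloor+1}^{\lfloor n/2 \rfloor} \frac{n-2m+1}{k^m} < \frac{k^{\lfloor p_e \rfloor}}{1-1/k} + \frac{n+1}{k^{\lfloor p_e \rfloor+1}(1-1/k)} - \frac{2}{k}\sum_{m=\lfloor p_e \rfloor}^{\infty} \frac{m+1}{k^m}$$

$$= \frac{k^{\lfloor p_e \rfloor+1}}{k-1} + \frac{n+1}{k^{\lfloor p_e \rfloor}(k-1)} - \frac{2(\lfloor p_e \rfloor+1)k - 2\lfloor p_e \rfloor}{k^{\lfloor p_e \rfloor}(k-1)^2}. \qquad (5)$$

Using (4) and the Maclaurin series for the exponential function, we compute (here $\{x\}$ denotes the fractional part of $x$)

$$k^{\lfloor p_e \rfloor} = \frac{k^{p_e}}{k^{\{p_e\}}} = \frac{\sqrt{n}}{k^{\{p_e\}}}\cdot k^{-\frac{\log n - 1}{(2\ln k)\cdot n} + O\left(\frac{\log^2 n}{n^2}\right)} = \frac{\sqrt{n}}{k^{\{p_e\}}}\cdot\Bigl(1 - \frac{\log n - 1}{2n} + O\Bigl(\frac{\log^2 n}{n^2}\Bigr)\Bigr). \qquad (6)$$

Substituting (4) and (6) into (5), we finally obtain

$$E_e(n,k) \le \frac{\sqrt{n}\cdot k^{1-\{p_e\}}}{k-1} + \frac{\sqrt{n}\cdot k^{\{p_e\}}}{k-1} + O\Bigl(\frac{\log n}{\sqrt{n}}\Bigr). \qquad (7)$$

Note that the constant inside the $O$-term can be chosen independent of $k$. Now we proceed with the
odd-length palindromes. The following property of the intersection point $p_o = p_o(n,k)$ is quite useful.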
Lemma 3.5. $p_e = p_o + 1/2$.

Proof:

Recall that $p_o$ is the root of the equation $k^{2p_o+1} = n - 2p_o$, so $p_o = \log\sqrt{n - 2p_o} - 1/2$. Similarly,
$p_e = \log\sqrt{n - 2p_e + 1}$. Then

$$p_e - p_o = \frac{1}{2} + \log\sqrt{\frac{n - 2p_e + 1}{n - 2p_o}}. \qquad (8)$$

Denoting the logarithm in (8) by $\Delta$, we obtain

$$\Delta = \log\sqrt{\frac{n - 2p_o - 1 - 2\Delta + 1}{n - 2p_o}} = \log\sqrt{1 - \frac{2\Delta}{n - 2p_o}}. \qquad (9)$$

If $\Delta > 0$, then the square root in (9) is less than 1, implying $\Delta < 0$. Similarly, if $\Delta < 0$, then the square
root in (9) is greater than 1, implying $\Delta > 0$. These contradictions show that the only possible case is
$\Delta = 0$, whence the result. □
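
As a numerical sanity check of Lemma 3.5 and of the estimate (4), one can solve the two defining equations by bisection. This is a sketch under our own naming; the bracketing interval $[-1, \log(n+1)]$ is an assumption that holds for the values of $n$ and $k$ used here.

```python
import math


def solve_increasing(g, lo: float, hi: float, iterations: int = 200) -> float:
    """Root of an increasing function g on [lo, hi] by bisection."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def p_even(n: int, k: int) -> float:
    """Root of k^(2p) = n - 2p + 1 (intersection point for even lengths)."""
    return solve_increasing(lambda p: k ** (2 * p) + 2 * p - n - 1,
                            -1.0, math.log(n + 1, k))


def p_odd(n: int, k: int) -> float:
    """Root of k^(2p+1) = n - 2p (intersection point for odd lengths)."""
    return solve_increasing(lambda p: k ** (2 * p + 1) + 2 * p - n,
                            -1.0, math.log(n + 1, k))


def p_even_estimate(n: int, k: int) -> float:
    """Main terms of (4): log(n)/2 - (log(n) - 1) / ((2 ln k) n), log in base k."""
    log_n = math.log(n, k)
    return log_n / 2 - (log_n - 1) / (2 * math.log(k) * n)


if __name__ == "__main__":
    n, k = 10 ** 6, 3
    print(p_even(n, k) - p_odd(n, k))              # expected to be 1/2 up to rounding
    print(p_even(n, k) - p_even_estimate(n, k))    # expected to be tiny
```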


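Lemma 3.5 and (4) give us

$$p_o = \frac{\log n - 1}{2} - \frac{\log n - 1}{(2\ln k)\cdot n} + O\Bigl(\frac{\log^2 n}{n^2}\Bigr). \qquad (10)$$

Similar to the even case, applying Lemma 3.3, we obtain

$$E_o(n,k) \le \sum_{m=0}^{\lfloor p_o \rfloor} k^{m+1} + \sum_{m=\lfloor p_o \rfloor+1}^{\lfloor (n-1)/2 \rfloor} \frac{n-2m}{k^m} < \frac{k^{\lfloor p_o \rfloor+1}}{1-1/k} + \frac{n}{k^{\lfloor p_o \rfloor+1}(1-1/k)} - \frac{2}{k}\sum_{m=\lfloor p_o \rfloor}^{\infty} \frac{m+1}{k^m}$$

$$= \frac{k^{\lfloor p_o \rfloor+2}}{k-1} + \frac{n}{k^{\lfloor p_o \rfloor}(k-1)} - \frac{2(\lfloor p_o \rfloor+1)k - 2\lfloor p_o \rfloor}{k^{\lfloor p_o \rfloor}(k-1)^2}. \qquad (11)$$

From (10) we have

$$k^{\lfloor p_o \rfloor} = \frac{\sqrt{n}}{k^{\{p_o\}+1/2}}\cdot\Bigl(1 - \frac{\log n - 1}{2n} + O\Bigl(\frac{\log^2 n}{n^2}\Bigr)\Bigr). \qquad (12)$$

Substituting (12) and (10) into (11), we finally get

$$E_o(n,k) \le \frac{\sqrt{n}\cdot k^{3/2-\{p_o\}}}{k-1} + \frac{\sqrt{n}\cdot k^{1/2+\{p_o\}}}{k-1} + O\Bigl(\frac{\log n\cdot\sqrt{k}}{\sqrt{n}}\Bigr), \qquad (13)$$

and from (7) and (13)

$$E(n,k) \le \frac{\sqrt{n}\cdot\bigl(\sqrt{k}\cdot(k^{1-\{p_o\}} + k^{\{p_o\}}) + (k^{1-\{p_e\}} + k^{\{p_e\}})\bigr)}{k-1} + O\Bigl(\frac{\log n\cdot\sqrt{k}}{\sqrt{n}}\Bigr), \qquad (14)$$

whence the result.

□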
Remark 3.1. According to Lemma 3.5, the expressions in internal parentheses in (14) oscillate in
antiphase. So, if $\{p_o\} \approx 0$ (i.e., $n$ is slightly bigger than an odd power of $k$), the bound (14) approaches its
maximum and approximates to $\sqrt{n}\,\bigl(\sqrt{k} + \frac{4\sqrt{k}}{k-1}\bigr)$, and if $\{p_e\} \approx 0$ (i.e., $n$ is slightly bigger than an even
power of $k$), this bound goes to minimum values close to $\sqrt{n}\,\bigl(3 + \frac{4}{k-1}\bigr)$.

The given upper bounds leave an impression that for any fixed $k$ the function $E(n,k)$ oscillates
between its low values close to $C\sqrt{n}$ for some absolute constant $C$ and its high values close to $D\sqrt{kn}$
for some absolute constant $D$. But the bound (14) is somewhat imprecise, because the initial bound (1)
is generous enough. Indeed, if the number of palindromic factors of length $m$ in a word is greater than
the number of distinct palindromes of this length, still some palindromes of length $m$ can be missing
from this word. Similarly, if the number of these factors of length $m$ in a word is less than the number
of distinct palindromes of this length, some of the factors can repeat, decreasing the number of distinct
palindromes. Since the probability of an event "to contain a given palindrome of length $m$" depends
not only on $n$, $k$, and $m$, but also on the internal structure of the palindrome, we cannot obtain a lower
bound on the expected number of palindromic factors just using standard balls-and-bins considerations.
Instead, we use a more powerful technique. This technique is based on the asymptotic estimates of the
number of words of length $n$ avoiding a given fixed factor.

4. Lower bound through avoidance of factors 

Below we assume that a $k$-ary alphabet $\Sigma$ is fixed, $k \ge 2$, all words are over $\Sigma$, and $\mathcal{P}$ is the set of all
palindromes over $\Sigma$. We say that a word $u$ avoids a word $w$ if $w$ is not a factor of $u$. Let $A_w(n)$ be the
number of words of length $n$ avoiding the word $w$ and let $E(n,k,m)$ be the expected number of distinct
palindromes of length $m$ in the words of length $n$. Note that, unlike in Lemma 3.2, repeated occurrences
are not counted here.

Lemma 4.1.

$$E(n,k,m) = \sum_{|w|=m,\; w\in\mathcal{P}} \Bigl(1 - \frac{A_w(n)}{k^n}\Bigr). \qquad (15)$$


Proof:

Consider the function on words that equals 1 if a word contains a given length $m$ palindrome $w$ and 0
otherwise. Applied to a random word, this function becomes a random variable with the expectation
$\bigl(1 - \frac{A_w(n)}{k^n}\bigr)$. This expectation is exactly the probability for a random word of length $n$ to contain $w$.

Clearly, by the linearity of expectation, $E(n,k,m)$ is the sum of such expectations over all palindromes
of length $m$. □
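
Lemma 4.1 can be confirmed by brute force for small parameters: the sum over palindromes of the probability of containing each of them must coincide with the average number of distinct palindromic factors of that length. The sketch below is illustrative (names are ours, and the avoidance counts are obtained by plain enumeration rather than by any asymptotic formula).

```python
from fractions import Fraction
from itertools import product

LETTERS = "abcdefghij"  # illustrative alphabet, supports k <= 10


def words(k: int, n: int):
    """All k-ary words of length n as strings."""
    return ("".join(t) for t in product(LETTERS[:k], repeat=n))


def avoiders(w: str, n: int, k: int) -> int:
    """A_w(n): number of k-ary words of length n that do not contain w as a factor."""
    return sum(1 for u in words(k, n) if w not in u)


def expected_distinct_via_avoidance(n: int, k: int, m: int) -> Fraction:
    """Right-hand side of (15)."""
    total = Fraction(0)
    for w in words(k, m):
        if w == w[::-1]:
            total += 1 - Fraction(avoiders(w, n, k), k ** n)
    return total


def expected_distinct_direct(n: int, k: int, m: int) -> Fraction:
    """Average number of distinct palindromic factors of length m over all words."""
    total = 0
    for u in words(k, n):
        factors = {u[i:i + m] for i in range(n - m + 1)}
        total += sum(1 for f in factors if f == f[::-1])
    return Fraction(total, k ** n)


if __name__ == "__main__":
    print(expected_distinct_via_avoidance(7, 2, 3), expected_distinct_direct(7, 2, 3))
```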

To make use of (15) for the estimation of $E(n,k) = \sum_{m=1}^{n} E(n,k,m)$, we have to estimate the
number of words avoiding a given palindrome. For this purpose, we use the technique developed by
Guibas and Odlyzko in [5, 6]. To formulate some of their results, we need to introduce some important
notions. Recall that a word $u$ is a border of a word $w$ if $u$ is both a prefix and a suffix† of $w$. With
each word $w$ of length $m$ we associate its border array, which is a word $\hat{w}[1..m]$ over $\{0,1\}$ such that
$\hat{w}[i] = 1$ if and only if $w$ has a border of length $m-i+1$. The border array can be interpreted as the array
of coefficients of a real-valued polynomial $f_w(x)$ such that $\hat{w}[i]$ is the coefficient of $x^{m-i}$. We refer to
this polynomial with 0-1 coefficients as the border polynomial of $w$. Since $\hat{w}[1] = 1$, this polynomial
has degree $m-1$.

†This definition deviates slightly from the usual one, which excludes the trivial case $u = w$.


Example 4.1. The word $w = aabaabaa$ has non-empty borders $w$, $aabaa$, $aa$, and $a$. Its border array $\hat{w}$
equals 10010011 and its border polynomial is $f_w(x) = x^7 + x^4 + x + 1$.
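
The border array and the border polynomial are straightforward to compute directly from the definition. The following Python sketch is illustrative (names are ours; the quadratic border test is chosen for clarity, not efficiency).

```python
def border_array(w: str) -> str:
    """Border array of w as a 0/1 string.

    Position i (1-based) is '1' iff w has a border of length m - i + 1,
    i.e. iff the prefix and the suffix of that length coincide.
    """
    m = len(w)
    bits = []
    for i in range(1, m + 1):
        length = m - i + 1
        bits.append("1" if w[:length] == w[m - length:] else "0")
    return "".join(bits)


def border_polynomial(w: str, x):
    """Value f_w(x), where the i-th entry of the border array multiplies x^(m-i)."""
    value = 0
    for bit in border_array(w):  # Horner's scheme, highest degree first
        value = value * x + int(bit)
    return value


if __name__ == "__main__":
    print(border_array("aabaabaa"))           # the word of Example 4.1
    print(border_polynomial("aabaabaa", 2))   # 2^7 + 2^4 + 2 + 1
```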

Theorem 4.1. ([5, 6])

1) The number $A_w(n)$ of words of length $n$ avoiding a given word $w$ of length $m > 3$ is

$$A_w(n) = C_w\theta_w^n + O(1.7^n), \qquad (16)$$

where

$$\theta_w = k - \frac{1}{f_w(k)} - \frac{f_w'(k)}{f_w^3(k)} + O\Bigl(\frac{m^2}{k^{3m}}\Bigr), \qquad C_w = \frac{1}{1 - (k-\theta_w)^2 f_w'(\theta_w)}. \qquad (17)$$

2) The condition $f_u(k) < f_w(k)$ implies $A_u(n) \le A_w(n)$ for all $n > 0$ and, in particular, $\theta_u \le \theta_w$.


Lemma 4.2. 1) For words $u$ and $w$, one has $f_u(k) < f_w(k)$ if and only if $\hat{u} < \hat{w}$, where $\hat{u}$ and $\hat{w}$ are
treated as binary numbers.

2) For any $m$, $\max_{|w|=m} \theta_w = \theta_{a^m}$.

Proof:

1) The comparison of $\hat{u}$ and $\hat{w}$ as binary numbers has the same result as the comparison of them as $k$-ary
numbers; but the number having $\hat{w}$ as its $k$-ary notation is exactly $f_w(k)$ by the definition of $f_w(x)$.

2) The border array of $a^m$ equals $1^m$ and thus represents the biggest number that can be written in
binary in $m$ bits. Now the statement follows from statement 1 and Theorem 4.1 2). □

Applying Lemma 4.2 and Theorem 4.1 2), we see that

$$A_w(n) \le A_{a^m}(n) \text{ for any palindrome } w \text{ of length } m. \qquad (18)$$
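
Inequality (18) can be observed directly on small instances by counting avoiding words exhaustively. The sketch below is illustrative only (names are ours); it checks the inequality for chosen small parameters and is of course not a proof.

```python
from itertools import product


def count_avoiding(w: str, n: int, alphabet: str) -> int:
    """A_w(n) by exhaustive enumeration of all words of length n over the alphabet."""
    return sum(1 for t in product(alphabet, repeat=n) if w not in "".join(t))


def check_bound_18(m: int, n: int, alphabet: str) -> bool:
    """True iff A_w(n) <= A_{a^m}(n) for every palindrome w of length m."""
    top = count_avoiding(alphabet[0] * m, n, alphabet)
    for t in product(alphabet, repeat=m):
        w = "".join(t)
        if w == w[::-1] and count_avoiding(w, n, alphabet) > top:
            return False
    return True


if __name__ == "__main__":
    # Binary words avoiding "aa" are counted by Fibonacci numbers.
    print([count_avoiding("aa", n, "ab") for n in range(1, 8)])
    print(check_bound_18(4, 10, "ab"))
```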

Thus we can get the lower bound on the expected number of palindromic factors replacing $w$ in (15) with
the word $v = a^m$. We have $f_v(x) = x^{m-1} + x^{m-2} + \cdots + x + 1 = (x^m - 1)/(x - 1)$, as we can assume
$x > 1$ since $k, \theta_v > 1$. Hence,

$$f_v'(x) = \Bigl(\frac{x^m - 1}{x - 1}\Bigr)' = \frac{m x^{m-1}(x-1) - x^m + 1}{(x-1)^2} = \frac{(m-1)x^m - m x^{m-1} + 1}{(x-1)^2}. \qquad (19)$$

Substituting these formulas into (17) and performing straightforward transformations, we get

$$\theta_v = k - \frac{k-1}{k^m - 1} - \frac{(k-1)\bigl((m-1)k^m - m k^{m-1} + 1\bigr)}{(k^m - 1)^3} + O\Bigl(\frac{m^2}{k^{3m}}\Bigr)$$

$$= k - \frac{k-1}{k^m} - \frac{k-1}{k^{2m}} - \frac{(k-1)\bigl((m-1)k - m\bigr)}{k^{2m+1}} + O\Bigl(\frac{km + m^2}{k^{3m}}\Bigr)$$

$$= k - \frac{k-1}{k^m} - \frac{m(k-1)^2}{k^{2m+1}} + O\Bigl(\frac{km + m^2}{k^{3m}}\Bigr), \qquad (20)$$

$$C_v = \frac{1}{1 - (k-\theta_v)^2 f_v'(\theta_v)} = 1 + (k-\theta_v)^2 f_v'(\theta_v) + O\bigl((k-\theta_v)^4 f_v'^2(\theta_v)\bigr)$$

$$= 1 + \frac{(k-1)^2}{k^{2m}}\Bigl(1 + O\Bigl(\frac{m}{k^m}\Bigr)\Bigr)\cdot\frac{(m-1)\theta_v^m - m\theta_v^{m-1} + 1}{(\theta_v - 1)^2} + O\Bigl(\frac{m^2}{k^{2m}}\Bigr) = 1 + O\Bigl(\frac{m}{k^m}\Bigr). \qquad (21)$$

Now we use (18) to estimate the sum in (15).

Since our goal is to estimate the ratio $E(n,k)/\sqrt{n}$, we do not need to cope with arbitrary $m$. Namely, we
put

$$m = 2(p_e + \varepsilon) = 2(p_o + \varepsilon) + 1, \text{ where } \varepsilon = O(1). \qquad (22)$$

Thus, $m = \log n + O(1)$. This is sufficient for reaching the declared goal because of the following


Remark 4.1. If $m - \log n = g(n)$ for any growing function $g$, then $E(n,k,m) = o(\sqrt{n})$, and then
$\sum_{m \ge \log n + g(n)} E(n,k,m) = o(\sqrt{n})$ (see Fig. 1); the same observation is true for the symmetric case
$m - \log n = -g(n)$.

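From (22) and (4) we get $k^m = n\cdot k^{2\varepsilon}\cdot\bigl(1 - O\bigl(\frac{\log n}{n}\bigr)\bigr)$ and

$$\frac{\theta_v}{k} = 1 - \frac{(k-1)\bigl(1 + O\bigl(\frac{\log n}{n}\bigr)\bigr)}{n\cdot k^{1+2\varepsilon}} + O\Bigl(\frac{\log n}{n^2}\Bigr) = 1 - \frac{k-1}{n\cdot k^{1+2\varepsilon}} + O\Bigl(\frac{\log n}{n^2}\Bigr). \qquad (23)$$

Substituting $(1 - a/n)^n = e^{-a}(1 + O(a/n))$ for big $n$, we have

$$1 - C_v\frac{\theta_v^n}{k^n} = 1 - \Bigl(1 + O\Bigl(\frac{\log n}{n}\Bigr)\Bigr)\Bigl(1 - \frac{k-1}{n\cdot k^{1+2\varepsilon}} + O\Bigl(\frac{\log n}{n^2}\Bigr)\Bigr)^n$$

$$= 1 - \Bigl(1 + O\Bigl(\frac{\log n}{n}\Bigr)\Bigr)\, e^{-\frac{k-1}{k^{1+2\varepsilon}}}\Bigl(1 + O\Bigl(\frac{1}{n}\Bigr)\Bigr) = 1 - e^{-\frac{k-1}{k^{1+2\varepsilon}}} + O\Bigl(\frac{\log n}{n}\Bigr). \qquad (24)$$

Finally, from (15) we obtain

$$E(n,k,m) \ge \mathrm{Pal}(k,m)\cdot\Bigl(1 - C_v\Bigl(\frac{\theta_v}{k}\Bigr)^n\Bigr) = \begin{cases} k^{\varepsilon}\Bigl(1 - e^{-\frac{k-1}{k^{1+2\varepsilon}}}\Bigr)\sqrt{n} + O\Bigl(\frac{\log n}{\sqrt{n}}\Bigr), & m \text{ is even}, \\[4pt] k^{\varepsilon+\frac{1}{2}}\Bigl(1 - e^{-\frac{k-1}{k^{1+2\varepsilon}}}\Bigr)\sqrt{n} + O\Bigl(\frac{\log n}{\sqrt{n}}\Bigr), & m \text{ is odd}. \end{cases} \qquad (25)$$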

In particular, we proved the lower bound of order $\sqrt{n}$ for $E(n,k)$, finishing the proof of Theorem 1.1 (1).

Furthermore, consider the function $g(k,\varepsilon) = k^{\varepsilon}\bigl(1 - e^{-\frac{k-1}{k^{1+2\varepsilon}}}\bigr)$. Clearly, $g(k,0) = \Omega(1)$. For odd $m$,
$\varepsilon = 0$ means that $p_o$ is integer. By the definition of $p_o$, for $p_o = i$ we have $n = n_i = k^{2i+1} + 2i$. So if we
take the sequence $\{n_i\}_{i=1}^{\infty}$ and $m = 2i + 1$, we obtain $E(n_i,k,m)/\sqrt{n_i} = \Omega(\sqrt{k})$. Comparing this to Lemma 3.4
we obtain statement 4 of Theorem 1.1. On the other hand, let us show that $g(k,\varepsilon) = \Omega(k^{-|\varepsilon|})$ for any
$\varepsilon$. Indeed, if $\varepsilon > 0$, then the Maclaurin series for $e^{-\frac{k-1}{k^{1+2\varepsilon}}}$ is alternating and monotonically decreasing in
absolute value, which gives us $g(k,\varepsilon) = k^{-\varepsilon}(1 + o(1))$. If $\varepsilon < 0$, then

$$g(k,\varepsilon) = k^{-|\varepsilon|}\Bigl(1 - \bigl(e^{-\frac{k-1}{k}}\bigr)^{k^{2|\varepsilon|}}\Bigr) \ge k^{-|\varepsilon|}\bigl(1 - e^{-\frac{1}{2}}\bigr) = \Omega(k^{-|\varepsilon|}).$$

For any $n$ and the odd number $m = 2(p_o + \varepsilon) + 1$ which is the closest odd integer to $2p_o + 1$, the absolute
value of $\varepsilon$ is at most 1/2. Then for this $m$ we have $E(n,k,m)/\sqrt{n} = \Omega(1)$. According to Remark 3.1 there is
a sequence (more precisely, one can take $n_i = k^{2i} + 2i - 1$) such that $E(n_i,k)/\sqrt{n_i} = O(1)$. Thus, we
finished the proof of Theorem 1.1 (3).

Note that statement 2 of Theorem 1.1 is not proved yet: from statements 3 and 4 it follows that
the limit does not exist for $k$ big enough, while we have to prove this fact for all $k$. To do this, we need
to tighten both upper and lower bounds.


5. Tight two-sided bounds 

Lemma 5.1. With high probability, all borders of a randomly chosen palindrome of length $m$ have
lengths less than $\lfloor\log m\rfloor$.


Proof:

By the definition of a border, any border of a palindrome is a palindrome. Thus, a palindrome has a
border of a given length if and only if it begins with a palindrome of this length. A random word of
length $2c$ or $2c+1$ is a palindrome with probability $k^{-c}$. Hence, by the union bound, the probability for
a random word to begin with a palindrome of length at least $2c$ is less than

$$\sum_{i=c}^{\infty} \frac{2}{k^i} = \frac{2k}{k^c(k-1)}.$$

If we take $c = \bigl\lfloor\frac{\log m}{2}\bigr\rfloor$, this probability will be $O(m^{-1/2})$. Thus, a palindrome of length $m$ has no
borders of length at least $2\cdot\bigl\lfloor\frac{\log m}{2}\bigr\rfloor \le \lfloor\log m\rfloor$ with probability $1 - O(m^{-1/2})$. □

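Now pick a palindrome $w$ of length $m$ at random. By Lemma 5.1 its border array $\hat{w}$ looks like
$10\cdots0u$, where $|u| < \lfloor\log m\rfloor$ with high probability. Since $w$ definitely has a one-letter border, $|u| > 0$.
Therefore, Theorem 4.1 2) and Lemma 4.2 allow us to take $x^{m-1} + 1$ and $x^{m-1} + x^{\lfloor\log m\rfloor}$ as the lower and
the upper bound for $f_w(x)$ when estimating $A_w(n)$ (the lower bound works always and the upper bound
works with high probability).

Now we take the function $x^{m-1} + x^c$, where the number $c \in \{0, 1, \ldots, \lfloor\log m\rfloor\}$ is unspecified, as
$f_w$, and compute $A_w(n)$ from it. We have $f_w'(x) = (m-1)x^{m-2} + c\,x^{c-1}$. Similar to (20) and (21) we
obtain

$$\theta_w = k - \frac{1}{k^{m-1} + k^c} - \frac{(m-1)k^{m-2} + c\,k^{c-1}}{(k^{m-1} + k^c)^3} + O\Bigl(\frac{m^2}{k^{3m}}\Bigr) = k - \frac{1}{k^{m-1}} + \frac{1}{k^{2m-2-c}} - \frac{m-1}{k^{2m-1}} + O\Bigl(\frac{m^2}{k^{3m}}\Bigr), \qquad (26)$$

$$C_w = \frac{1}{1 - (k-\theta_w)^2 f_w'(\theta_w)} = 1 + O\bigl((k-\theta_w)^2\bigr)\cdot\bigl((m-1)\theta_w^{m-2} + c\,\theta_w^{c-1}\bigr) = 1 + O\Bigl(\frac{m}{k^m}\Bigr). \qquad (27)$$

Next we substitute $m = 2(p_e + \varepsilon) = 2(p_o + \varepsilon) + 1$, where $\varepsilon = O(1)$. Recalling that $k^c = O(m)$, we
obtain, similar to (23), (24),

$$\frac{\theta_w}{k} = 1 - \frac{1}{n\cdot k^{2\varepsilon}} + O\Bigl(\frac{\log n}{n^2}\Bigr), \qquad (28)$$

$$1 - C_w\frac{\theta_w^n}{k^n} = 1 - e^{-k^{-2\varepsilon}} + O\Bigl(\frac{\log n}{n}\Bigr). \qquad (29)$$

The resulting asymptotic formulas are independent of $c$. So (29) gives the asymptotic value of a term in
(15) with high probability. All terms falling into the remaining small group can be bounded using (24),
which gives a formula equivalent to (29) up to a multiplicative constant. Hence we can substitute (29)
for all terms in (15), getting finally

$$E(n,k,m) \approx \mathrm{Pal}(k,m)\cdot\Bigl(1 - C_w\Bigl(\frac{\theta_w}{k}\Bigr)^n\Bigr) \approx \begin{cases} k^{\varepsilon}\bigl(1 - e^{-k^{-2\varepsilon}}\bigr)\sqrt{n}, & m \text{ is even}, \\[4pt] k^{\varepsilon+\frac{1}{2}}\bigl(1 - e^{-k^{-2\varepsilon}}\bigr)\sqrt{n}, & m \text{ is odd}. \end{cases} \qquad (30)$$

To extract the bounds on $E(n,k)/\sqrt{n}$ from (30), we look at the function appearing as the coefficient of $\sqrt{n}$.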


Remark 5.1. The function $f(x) = x\bigl(1 - e^{-1/x^2}\bigr)$ behaves over the interval $(0,\infty)$ as follows:

1. $f(x) \approx 1/x$ (up to a cubically small term) as $x \to \infty$; more precisely, for $x > 1$ one has
$f(x) = \frac{1}{x} - \frac{1}{2x^3} + \frac{1}{6x^5} - \Delta$, where $0 < \Delta < \frac{1}{24x^7}$;

2. $f(x) \approx x$ (up to an exponentially small term $-x e^{-1/x^2}$) as $x \to 0$;

3. $f(x)$ has a single maximum $\chi \approx 0.6382$ at the point $x_0 \approx 0.8921$ and is nearly constant around
this point (e.g., $f(1) = 1 - 1/e \approx 0.6321$).
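
The three properties can be checked numerically. In the Python sketch below (names are ours) the maximum is located through the stationarity condition $e^{t} = 1 + 2t$ with $t = 1/x^2$, which follows from differentiating $f$; this reformulation is our own working step and is not taken from the text.

```python
import math


def f(x: float) -> float:
    """f(x) = x * (1 - exp(-1/x^2)); expm1 keeps precision for large x."""
    return -x * math.expm1(-1.0 / (x * x))


def argmax_f() -> float:
    """Maximum point of f on (0, inf).

    f'(x) = 0 is equivalent to exp(t) = 1 + 2t with t = 1/x^2; the positive
    root is bracketed by [0.5, 3] and found by bisection.
    """
    lo, hi = 0.5, 3.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if math.exp(mid) - 1 - 2 * mid < 0:
            lo = mid
        else:
            hi = mid
    return 1.0 / math.sqrt((lo + hi) / 2)


def series_gap(x: float) -> float:
    """Delta = 1/x - 1/(2x^3) + 1/(6x^5) - f(x), the quantity bounded in item 1."""
    return 1 / x - 1 / (2 * x ** 3) + 1 / (6 * x ** 5) - f(x)


if __name__ == "__main__":
    x0 = argmax_f()
    print(x0, f(x0), f(1.0))
    print(series_gap(2.0), 1 / (24 * 2.0 ** 7))
```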

Now consider $F(k,\varepsilon) = \sum_{i=-\infty}^{\infty} f(k^{\varepsilon+i})$. By Remark 5.1, this series clearly converges, being
bounded by the sum of two geometric series with the same common ratio $k^{-1}$. Furthermore, $F(k,\varepsilon)$ is
periodic with the period 1 for any fixed $k \in \mathbb{N}\setminus\{1\}$.

To make the computation of the sum $E(n,k) = \sum_{m=1}^{n} E(n,k,m)$ easier, we first discard most of its
terms, leaving only the terms with $|m - \log n| \le c$ for some constant $c$. This produces an error of order $k^{-c/2}\sqrt{n}$
(see Fig. 1, cf. Remark 4.1). Every term of the remaining sum can be computed by the formula (30).
Next we replace this finite sum with an infinite sum of terms (30), taken for all $\varepsilon$ such that $-\infty < \varepsilon < \infty$
and either $p_e + \varepsilon$ or $p_o + \varepsilon$ is an integer. By Remark 5.1 the sum we thus added is also of order $k^{-c/2}\sqrt{n}$.
Hence, we totally change $E(n,k)$ by an amount of order $k^{-c/2}\sqrt{n}$. Since the constant $c$ can be taken
big enough, we can neglect this change in our considerations and identify $E(n,k)$ with this infinite sum,
getting

$$E(n,k) \approx \bigl(F(k,\varepsilon)\sqrt{k} + F(k,\varepsilon + \tfrac{1}{2})\bigr)\sqrt{n}, \text{ where } p_o(n,k) + \varepsilon \in \mathbb{Z}. \qquad (31)$$
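
The coefficient of $\sqrt{n}$ in (31) is easy to evaluate numerically by truncating the series for $F$. The sketch below is illustrative: the function names and the truncation at 40 terms on each side are our assumptions (the neglected tails are geometrically small for the moderate values of $k$ considered here).

```python
import math


def f(x: float) -> float:
    """f(x) = x * (1 - exp(-1/x^2)) from Remark 5.1."""
    return -x * math.expm1(-1.0 / (x * x))


def F(k: int, eps: float, terms: int = 40) -> float:
    """Truncation of F(k, eps) = sum over all integers i of f(k^(eps + i))."""
    return sum(f(k ** (eps + i)) for i in range(-terms, terms + 1))


def coefficient(k: int, eps: float) -> float:
    """Coefficient of sqrt(n) in (31): F(k, eps) * sqrt(k) + F(k, eps + 1/2)."""
    return F(k, eps) * math.sqrt(k) + F(k, eps + 0.5)


if __name__ == "__main__":
    # Compare with the theoretical columns of Table 1 for k = 2 and k = 3.
    for k, eps in [(2, -0.103), (2, 0.398), (3, 0.255), (3, -0.251)]:
        print(k, eps, round(coefficient(k, eps), 5))
```

Scanning `coefficient(k, eps)` over a grid of `eps` in an interval of length 1 gives a simple way to approximate the liminf and limsup constants for other alphabets.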

In order to prove Theorem 1.1 (2), it remains to show that the function $F(k,\varepsilon)$ has no period 1/2 for any
fixed $k \in \mathbb{N}\setminus\{1\}$. For this, let us first consider $F(k,0)$ and $F(k,1/2)$. From (30) and Remark 5.1 we
have


nfc,o) = i-A^ 


+ 


2(/c3 - 1) 6(A:5 - 1) 


— A, where A < 


24(F - 1) ’ 


(32) 
yielding F{k, 0)>l-i + ^- ^ > 3. Similarly, 


Then 


F(fc,l/2) 

F(A;,0) -F(/s,l/2) > 


^ 2^/k 
- k-l 


1 

1 - - + 
e 


^ 3/2 ^ 5/2 

~ 2(/c3 - 1) 6(fe5 - 1) 

2(1 - ^/fe) (A;3/2 - 1) 

k-l ^ 2(fe3 - 1) 



1 

6(A:3 - 1) ’ 


(33) 

(34) 


The difference (34) can be checked by hand or by computer-assisted symbolic computation to be positive
for any $k \ge 4$. Hence, the function $F(k,\varepsilon)$ has no period 1/2 in these cases. This implies that no limit
$\lim_{n\to\infty} E(n,k)/\sqrt{n}$ exists according to (31). The cases $k = 2$ and $k = 3$ require a separate analysis, but
since $k$ is fixed, this is feasible. It appears that in each case $F(k,\varepsilon)$ has a single maximum and a single
minimum on any interval of length 1, and thus has no period 1/2. In more detail, $\max F(2,\varepsilon) \approx 2.55775$
at the point $\varepsilon \approx 0.398$ and $\min F(2,\varepsilon) \approx 2.55647$ at the point $\varepsilon \approx -0.103$; $\max F(3,\varepsilon) \approx 1.62212$
at the point $\varepsilon \approx -0.251$ and $\min F(3,\varepsilon) \approx 1.60452$ at the point $\varepsilon \approx 0.255$. This finally proves
statement 2 and then Theorem 1.1.


Remark 5.2. The difference between the maximum and the minimum in the binary case is really tiny;
to prove its existence, all terms given in Remark 5.1 (1), (2) are essential.

With all the bounds obtained, the following proposition is easy. 

Proposition 5.1. (1) $\lim_{k\to\infty} C_-(k) = 3 - 1/e$.

(2) $\lim_{k\to\infty} C(k)/\sqrt{k} = \chi$, where $\chi \approx 0.6382$ is the maximum of the function $f(x) = x\bigl(1 - e^{-1/x^2}\bigr)$
in the interval $(0,\infty)$.

Proof: 

For statement 1, note that (l30l) gives us a coefficient of order for the number of odd-length 

palindromes and a coefficient of order for the number of even-length palindromes. So we can get 
a coefficient of order $O(1)$ only by taking a subsequence of $n$'s such that the corresponding $\varepsilon$'s tend to
1/2. In this case, even palindromes contribute $1 - 1/e + O(1/k)$ and odd-length palindromes contribute
$2 + O(1/k)$, whence the result.

Let us turn to statement 2. Let $\varepsilon_0 = \log x_0$, where $x_0$ is defined in Remark 5.1 (3). One can choose a
subsequence of $n$'s such that the corresponding sequence of $\varepsilon$'s converges to $\varepsilon_0$. Then the expectations
$E(n,k,m)$, corresponding to these $n$'s and $\varepsilon$'s, form a sequence equivalent to $\chi\sqrt{kn}$ as $n \to \infty$, see
(30). On the other hand, the function $\chi\sqrt{kn}$ bounds any sequence of expectations $E(n,k,m)$ from above.
It remains to note that at most one term $E(n,k,m)$ for a given $n$ is proportional to $\sqrt{kn}$, while all others
are proportional to $k^c\sqrt{n}$ for some $c \le 0$. The result now follows. □


6. Numerical results and possible extensions 

Below we give, in Table 1, the numerical estimates for some particular values of $C_-(k)$ and $C(k)$ together
with the corresponding values of $\varepsilon$ such that $p_o + \varepsilon$ is an integer and $|\varepsilon| \le 1/2$. We compare these
Table 1. Theoretical values of the constants $C_-(k) = \liminf_{n\to\infty} E(n,k)/\sqrt{n}$ and $C(k) = \limsup_{n\to\infty} E(n,k)/\sqrt{n}$, the
corresponding values of the distance $\varepsilon$ between $p_o(n,k)$ and the closest integer, and the experimental data on the
number of distinct palindromes in random words of lengths fitting to the obtained values of $\varepsilon$.

  k    C_-(k)    eps      C(k)      eps      n            Pals_n/sqrt(n)   n            Pals_n/sqrt(n)
  2    6.17315   -0.103   6.17368    0.398   618843800    6.17171          1238545800   6.17276
  3    4.40121    0.255   4.41410   -0.251   8188445      4.40052          24940577     4.41358
  4    3.81315    0.360   3.85763   -0.167   24747862     3.81195          6657745      3.85465
  5    3.51925    0.409   3.60893   -0.129   13076560     3.51834          2914038      3.60581
  6    3.34259    0.438   3.48553   -0.108   2096750      3.34202          14840282     3.48520
 10    3.02693    0.485   3.41133   -0.071   1071524      3.02544          13842043     3.41175
 50    2.70152   -0.485   5.09183   -0.032   5877686      2.70007          160063       5.08441



numerical values against the experimental data on the palindromic richness of random words. The
problem of counting distinct palindromic factors in a word can be efficiently solved: see [4] for an offline
algorithm and [8] for an online one. This makes experiments with long random words possible. For
each length, Table 1 contains the average number of palindromes over 1000 experiments, divided by $\sqrt{n}$.
The experimental data agree quite well with the theory; for longer words the agreement is better. We
also mention a special situation with the binary alphabet: the difference $C(2) - C_-(2)$ is very small, and
the values of $\varepsilon$ corresponding to $C_-(2)$ and $C(2)$ are "swapped" compared to bigger alphabets.
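
For readers who wish to repeat such experiments on a small scale, here is a sketch of a linear-time counter based on a palindromic tree (one node per distinct palindrome, linked by suffix links). It is our own illustrative implementation and is not claimed to coincide with the algorithms of [4] or [8]; the helper names, the seed, and the experiment sizes are assumptions.

```python
import math
import random


def count_distinct_palindromes(s: str) -> int:
    """Number of distinct non-empty palindromic factors of s via a palindromic tree."""
    length = [-1, 0]   # node 0: imaginary root (length -1), node 1: empty palindrome
    link = [0, 0]      # suffix links
    edges = [{}, {}]   # edges[v][c] = node of the palindrome c + pal(v) + c
    last = 1           # node of the longest palindromic suffix of the processed prefix
    for i, ch in enumerate(s):
        cur = last
        # longest palindromic suffix that can be extended by ch on both sides
        while not (i - length[cur] - 1 >= 0 and s[i - length[cur] - 1] == ch):
            cur = link[cur]
        if ch in edges[cur]:
            last = edges[cur][ch]
            continue
        new = len(length)
        length.append(length[cur] + 2)
        edges.append({})
        if length[new] == 1:
            link.append(1)
        else:
            p = link[cur]
            while not (i - length[p] - 1 >= 0 and s[i - length[p] - 1] == ch):
                p = link[p]
            link.append(edges[p][ch])
        edges[cur][ch] = new
        last = new
    return len(length) - 2


def brute_force_count(s: str) -> int:
    """Reference implementation by enumerating all factors (short words only)."""
    n = len(s)
    return len({s[i:j] for i in range(n) for j in range(i + 1, n + 1)
                if s[i:j] == s[i:j][::-1]})


def average_ratio(n: int, k: int, trials: int, rng: random.Random) -> float:
    """Average of richness / sqrt(n) over random k-ary words of length n."""
    alphabet = "abcdefghijklmnopqrstuvwxyz"[:k]
    total = 0
    for _ in range(trials):
        total += count_distinct_palindromes("".join(rng.choices(alphabet, k=n)))
    return total / (trials * math.sqrt(n))


if __name__ == "__main__":
    print(average_ratio(2000, 2, 20, random.Random(1)))
```

The words used here are far shorter than those in Table 1, so only rough agreement with the theoretical constants should be expected.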

Finally, we point out that the technique used in this paper can be applied to computing the expected
numbers of other types of repetitions in random words. For example, it is quite easy to show that the
expected number of squares in a $k$-ary word of length $n$ is $\Theta(\sqrt{n})$; moreover, the ratio of this number and
$\sqrt{n}$ tends to a constant as $k \to \infty$. Indeed, squares are very much like the even-length palindromes
(e.g., the left graph of Fig. 1 suits squares as well), and there is no analog of odd-length palindromes
to disturb the general picture. The only significant difference between squares and even palindromes is
in their borders: palindromes usually have only short borders, while a square of length $n$ always has the
border of length $n/2$, and with high probability has no longer borders. The corresponding difference
in border polynomials affects the constant before the $\sqrt{n}$ term, but not the term itself (compare (25)
against (30)). Thus, the analog of (30) can be obtained, with a slightly different constant and without the
alternative for odd-length palindromes.